Genetics of common complex psychiatric disorders

Mark Adams

Part 1: Biometrics

Mark Adams
Division of Psychiatry
mark.adams@ed.ac.uk
Genetics and Environmental Influences on Behaviour and Mental Health

What is a “common”, “complex” psychiatric disorder?

Common: Affects 1% or more of the population
Complex: Inheritance cannot be explained by a single gene

🧬👪🚬💢🏡💞🩻🏫

  • Depression: 3% in a week
  • Schizophrenia: 1% in lifetime
  • Bipolar disorder: 2% in lifetime
  • Anxiety disorder: 6% in a week

Why genetics?

Why use genetics to study mental health and psychiatric disorders?

  • Biological understanding of genes, pathways
  • Shared aetiology with other disorders
  • Risk prediction
  • Drug retargeting
  • Causal analysis of environmental risk factors

Genetics of categorical traits

Diagram showing the seven “characters” observed by Mendel

Genetics of continuous traits

Reconciling categorical + continuous genetics = quantitative genetics

Polygenic traits are quantitative traits

Summing the effects of a large number of genetic variants produces a continuous, approximately normally distributed phenotype. This follows from the Central Limit Theorem.
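The link to the Central Limit Theorem can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not in the slides: 200 independent loci, every allele with frequency 0.5 and an equal effect of 1. The function name and parameters are hypothetical.

```python
import random
import statistics

def simulate_polygenic_scores(n_people=2000, n_loci=200, allele_freq=0.5, seed=1):
    """Sum allele counts (0, 1 or 2 per locus) across many loci for each person."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        # Each locus contributes two independent allele draws, one per parent.
        score = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        scores.append(score)
    return scores

scores = simulate_polygenic_scores()
mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)
# Share of people within one SD of the mean; roughly 0.68 for a normal distribution.
within_one_sd = sum(abs(s - mean) <= sd for s in scores) / len(scores)
print(round(mean, 1), round(sd, 2), round(within_one_sd, 2))
```

Although each locus takes only three discrete values, the summed score should be close to bell-shaped, with a mean near 200 and a standard deviation near 10 under these assumed settings.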

Biometrics

What are the sources of family resemblance? How do we quantify them numerically?

Heritability

Proportion of similarity in phenotypes that can be attributed to similarity in genotypes.

Model: Phenotype (P) = Genotype (G) + Environment (E)
Variance decomposition \[\mathrm{var}(P) = \mathrm{var}(G) + \mathrm{var}(E)\]

Proportion of variance \[H^2 = \frac{\mathrm{var}(G)}{\mathrm{var}(P)}, e^2 = \frac{\mathrm{var}(E)}{\mathrm{var}(P)}, H^2 + e^2 = 1\]
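The decomposition can be checked numerically. The sketch below assumes that \(G\) and \(E\) are independent and normally distributed, with made-up variances \(\mathrm{var}(G) = 3\) and \(\mathrm{var}(E) = 1\); none of these values come from the slides.

```python
import random
import statistics

rng = random.Random(42)
n = 20000
# Hypothetical variances chosen for illustration: var(G) = 3, var(E) = 1.
g = [rng.gauss(0, 3 ** 0.5) for _ in range(n)]
e = [rng.gauss(0, 1.0) for _ in range(n)]
p = [gi + ei for gi, ei in zip(g, e)]  # P = G + E

var_g = statistics.pvariance(g)
var_e = statistics.pvariance(e)
var_p = statistics.pvariance(p)

H2 = var_g / var_p  # proportion of phenotypic variance due to genotype
e2 = var_e / var_p  # proportion due to environment
print(round(H2, 2), round(e2, 2), round(H2 + e2, 2))
```

With these settings \(H^2\) should be close to 0.75. The two proportions sum to 1 only up to sampling noise, because the sample covariance between \(G\) and \(E\) is not exactly zero.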

How to estimate heritability from data

Plot of child (offspring) height versus the average of their parents’ heights. What is a statistic that can be used to summarise the relationship between these two variables?

How to estimate heritability from data

\(\beta = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}\)

Estimate the beta coefficient (slope) for a simple regression from the covariance between predictor (\(X\)) and outcome (\(Y\)) variable divided by the variance of the predictor (\(X\)).
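As a minimal sketch, the slope can be computed directly from this formula. The helper names are hypothetical and the five data points are made up for illustration; they are not the height data shown in the plot.

```python
def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def slope(x, y):
    """Regression slope of y on x: cov(x, y) / var(x)."""
    return covariance(x, y) / covariance(x, x)  # cov(x, x) is var(x)

# Made-up illustrative heights in cm.
midparent = [165.0, 170.0, 172.0, 175.0, 180.0]
offspring = [168.0, 169.0, 173.0, 174.0, 178.0]
beta = slope(midparent, offspring)
print(round(beta, 2))  # about 0.70
```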

Simple model of genetic and environmental effects

\[ P = A + E \]

The phenotype value \(P\) is influenced by an additive genetic effect \(A\) and an environmental effect \(E\).

Simple model of genomics

\[ A = d + s \]

Each individual has two copies of the genome, one inherited from each parent: \(d\) from the mother (dam) and \(s\) from the father (sire).

Simple model of inheritance

Simple model of genetics, environment, and inheritance

Phenotype (\(P\)) value is the sum of the two genetic values plus an environmental value (\(e\)).

  • Mother’s phenotype: \(P_d = d + d^\prime + e_d\)
  • Father’s phenotype: \(P_s = s + s^\prime + e_s\)
  • Child’s phenotype: \(P_o = d + s + e_o\)

Regression equation

\(\beta = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}\)

  • \(X\) = average of parents’ phenotypes
  • \(Y\) = offspring phenotype

Therefore, \(\beta = \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})}\)

Parent–offspring covariance

\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) \]

\[ = \mathrm{cov}(\frac{d + d^\prime + e_d + s + s^\prime + e_s}{2}, d + s + e_o) \]

Parent–offspring covariance

Expand the terms. Recall that:

\[ \mathrm{cov}(A+X,B+Y) = \mathrm{cov}(A,B) + \mathrm{cov}(A,Y) + \mathrm{cov}(X,B) + \mathrm{cov}(X,Y) \]

Thus we can do a pairwise expansion to:

\[ = \mathrm{cov}(\frac{d}{2} + \frac{d^\prime}{2} + \frac{e_d}{2} + \frac{s}{2} + \frac{s^\prime}{2} + \frac{e_s}{2}, d + s + e_o) \]

\[ = \mathrm{cov}(\frac{d}{2}, d) + \mathrm{cov}(\frac{d^\prime}{2}, d) + \dotsm + \mathrm{cov}(\frac{e_s}{2}, e_o) \]

Simplifications

Some terms can be simplified.

Covariance between a genetic effect and itself \[ \mathrm{cov}(\frac{d}{2}, d), \mathrm{cov}(\frac{s}{2}, s) \]

Simplifies to:

\[ \mathrm{cov}(\frac{d}{2}, d) = \frac{1}{2}\mathrm{cov}(d, d) = \frac{1}{2}\mathrm{var}(d) \] \[ \mathrm{cov}(\frac{s}{2}, s) = \frac{1}{2}\mathrm{cov}(s, s) = \frac{1}{2}\mathrm{var}(s) \]

Assumptions

For some terms we might make an assumption that they are equal to 0.

Covariance between genetic effects from the same parent \[ \mathrm{cov}(\frac{d^\prime}{2}, d), \mathrm{cov}(\frac{s^\prime}{2}, s) \]

Covariance between genetic effects from different parents \[ \mathrm{cov}(\frac{d^\prime}{2}, s), \mathrm{cov}(\frac{s^\prime}{2}, d) \]

Covariance between parent and offspring environment effects \[ \mathrm{cov}(\frac{e_d}{2}, e_o), \mathrm{cov}(\frac{e_s}{2}, e_o) \]

Covariance between parental genetic and offspring environmental effects \[ \mathrm{cov}(\frac{d}{2}, e_o), \mathrm{cov}(\frac{s}{2}, e_o) \]

Using those assumptions the parent–offspring covariance simplifies to

\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) = \frac{\mathrm{var}(d) + \mathrm{var}(s)}{2} \]

Parent variance

The denominator in the regression equation was \[ \mathrm{var}(\frac{P_d + P_s}{2}) \]

Using the identity \[ \mathrm{var}(aX + bY) = a^2\mathrm{var}(X) + b^2\mathrm{var}(Y) + 2ab\mathrm{cov}(X, Y) \] the variance of the average parental phenotypes is: \[ \mathrm{var}(\frac{P_d + P_s}{2}) = \mathrm{var}(\frac{1}{2}P_d + \frac{1}{2} P_s) \] \[ = \left(\frac{1}{2}\right)^2\mathrm{var}(P_d) + \left(\frac{1}{2}\right)^2\mathrm{var}(P_s) + 2 \cdot \frac{1}{2} \cdot \frac{1}{2} \mathrm{cov}(P_d, P_s) \] \[ = \frac{1}{4}\mathrm{var}(P_d) + \frac{1}{4}\mathrm{var}(P_s) + \frac{1}{2} \mathrm{cov}(P_d, P_s) \]

If we assume, as above, that the parents' phenotypes do not covary (\(\mathrm{cov}(P_d, P_s) = 0\), i.e. no assortative mating for the trait), this simplifies to

\[ = \frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4} \]

Thus the regression equation is:

\[ \beta = \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})} \\ = \frac{\frac{\mathrm{var}(d) + \mathrm{var}(s)}{2}}{\frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4}} \\ = 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \]

Previously we defined

\[ A = d + s \] thus \[ \mathrm{var}(A) = \mathrm{var}(d) + \mathrm{var}(s) \] and assume variances in parental phenotypes are equal \[ \mathrm{var}(P_d) = \mathrm{var}(P_s) = \mathrm{var}(P) \]

Then substitute into the regression equation

\[ \beta = 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \\ = 2 \frac{\mathrm{var}(A)}{\mathrm{var}(P) + \mathrm{var}(P)} \\ = 2 \frac{\mathrm{var}(A)}{2 \mathrm{var}(P)} \\ = \frac{\mathrm{var}(A)}{\mathrm{var}(P)} \\ = h^2 \]
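The result \(\beta = h^2\) can be checked by simulating the model directly. This is a sketch under the same assumptions as the derivation (all genetic and environmental values independent, equal variances in both parents), plus two added assumptions: normally distributed effects and a total phenotypic variance of 1. The function names and the value \(h^2 = 0.6\) are hypothetical.

```python
import random

def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def simulate_midparent_regression(h2=0.6, n_families=20000, seed=7):
    """Simulate families and return the slope of offspring on mid-parent phenotype."""
    rng = random.Random(seed)
    # var(P) is fixed at 1, so var(A) = h2 and var(E) = 1 - h2.
    sd_genetic = (h2 / 2) ** 0.5  # each of the two genetic values carries half of var(A)
    sd_env = (1 - h2) ** 0.5
    midparent, offspring = [], []
    for _ in range(n_families):
        d, d_prime, s, s_prime = (rng.gauss(0, sd_genetic) for _ in range(4))
        e_d, e_s, e_o = (rng.gauss(0, sd_env) for _ in range(3))
        p_d = d + d_prime + e_d  # mother's phenotype
        p_s = s + s_prime + e_s  # father's phenotype
        p_o = d + s + e_o        # child inherits d and s
        midparent.append((p_d + p_s) / 2)
        offspring.append(p_o)
    return covariance(midparent, offspring) / covariance(midparent, midparent)

beta = simulate_midparent_regression(h2=0.6)
print(round(beta, 2))
```

The estimated slope should land close to the simulated \(h^2\), up to sampling error. Introducing a correlation between, say, \(e_d\) and \(e_o\) would show how a violated assumption biases the estimate.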

Height data

  • \(\mathrm{cov}(\frac{P_d + P_s}{2}, P_o) =\) 12.57
  • \(\mathrm{var}(\frac{P_d + P_s}{2}) =\) 22.04
  • \(\hat{h}^2 =\) 12.57 / 22.04 = 0.57

Parent and offspring phenotypes become more highly correlated as heritability increases.

Mini review: What assumptions have we made when estimating \(h^2\)?

Generalising to other relatives

Heritability can also be estimated from resemblance between different types of related pairs. The general equation is:

\[ h^2 = \frac{b}{\mathrm{r}} \]

  • \(b\) = regression coefficient
  • \(\mathrm{r}\) = relatedness coefficient (“coefficient of additive variance”)

Example data: depression scores

Correlation of depression scores for different pairs of relatives

Recurrence risk to relatives

\[ \lambda_\mathrm{R} = \frac{P(\mathrm{affected} | \mathrm{relative\ affected})}{P(\mathrm{affected\ in\ population})} = \frac{K_\mathrm{R}}{K} \]

Example:

  • \(K_\mathrm{sib} = P(\mathrm{affected} | \mathrm{sibling\ affected}) = 0.09\)
  • \(K = P(\mathrm{affected\ in\ population}) = 0.02\)
  • \(\frac{K_\mathrm{sib}}{K} = \frac{0.09}{0.02} = 4.5\)
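The example above can be reproduced in a few lines. The function name is hypothetical; the input values are the ones from the example.

```python
def recurrence_risk_ratio(k_relative, k_population):
    """Relative recurrence risk: lambda_R = K_R / K."""
    if not 0 < k_population <= 1:
        raise ValueError("population prevalence must be in (0, 1]")
    return k_relative / k_population

lambda_sib = recurrence_risk_ratio(0.09, 0.02)
print(round(lambda_sib, 2))  # 4.5
```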

Recurrence risk for schizophrenia

Recurrence risk and heritability

  • Code \(\mathrm{unaffected} = 0, \mathrm{affected} = 1\)
  • If population prevalence is \(K\), then phenotypic variance is \(K(1-K)\) (Bernoulli distribution)

  • \(Y\) = score of individual (proband)
  • \(Y_\mathrm{R}\) = score of relative of proband
  • Expectation: \(E[Y] = E[Y_\mathrm{R}] = K\)
  • \(K_\mathrm{R} = E[Y_\mathrm{R} | Y = 1]\)
  • Probability that both \(Y\) and \(Y_\mathrm{R}\) = 1: \(E[YY_\mathrm{R}] = K \times K_\mathrm{R}\)

\[ \mathrm{cov}(Y, Y_\mathrm{R}) = E[YY_\mathrm{R}] - E[Y] E[Y_\mathrm{R}] \]

\[ = K \times K_\mathrm{R} - K^2 \\ = K(K_\mathrm{R} - K) \\ = K^2 (\frac{K_\mathrm{R}}{K} - 1) \\ = K^2 (\lambda_\mathrm{R} - 1) \]

Heritability estimate

\[ h^2 = \frac{\mathrm{cov}_\mathrm{R}}{\mathrm{r}V_\mathrm{P}} \\ = \frac{K^2 (\lambda_\mathrm{R} - 1)}{\mathrm{r}K(1-K)} \\ = \frac{K (\lambda_\mathrm{R} - 1)}{\mathrm{r}(1-K)} \]
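As a worked sketch, the formula can be applied to the earlier sibling example (\(K = 0.02\), \(\lambda_\mathrm{sib} = 4.5\)). The relatedness of 0.5 for full siblings is the standard expected value, and the function name is hypothetical. The estimate is on the observed 0/1 scale and assumes that sibling resemblance is entirely due to additive genetic effects.

```python
def observed_scale_h2(K, lambda_r, r):
    """Heritability on the 0/1 scale from prevalence K, recurrence risk ratio and relatedness r."""
    cov_r = K ** 2 * (lambda_r - 1)  # cov(Y, Y_R) = K^2 (lambda_R - 1)
    var_p = K * (1 - K)              # Bernoulli variance of a 0/1 phenotype
    return cov_r / (r * var_p)

h2_sib = observed_scale_h2(K=0.02, lambda_r=4.5, r=0.5)
print(round(h2_sib, 2))  # about 0.14
```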

Estimating environmental effects

Contrast pairs of relatives that have comparable environmental similarity but different genetic similarity.

  • Monozygotic (MZ) twins \(\mathrm{r} = 1.0\)
  • Dizygotic (DZ) twins \(\mathrm{r} = 0.5\)

Additive genetic and shared environment effects

Add a shared (\(C\) or “common”) environment to the basic genetic model, to capture similarity between relatives attributable to environmental factors. \(E\) represents the unique, non-shared environment.

\[P = A + C + E\]

\[h^2 = \frac{\mathrm{var}(A)}{\mathrm{var}(P)}, c^2 = \frac{\mathrm{var}(C)}{\mathrm{var}(P)}, e^2 = \frac{\mathrm{var}(E)}{\mathrm{var}(P)}\]

\[h^2 + c^2 + e^2 = 1\]

Twin correlations

MZ twins: \(r_\mathrm{MZ} = h^2 + c^2\)

DZ twins: \(r_\mathrm{DZ} = \frac{1}{2}h^2 + c^2\)

Solve for genetic similarity (\(h^2\))

Calculate difference between MZ and DZ correlations

\[ r_\mathrm{MZ} - r_\mathrm{DZ} = (h^2 + c^2) - (\frac{1}{2}h^2 + c^2) \]

\[ r_\mathrm{MZ} - r_\mathrm{DZ} = h^2 - \frac{1}{2}h^2 + c^2 - c^2 \]

\[ r_\mathrm{MZ} - r_\mathrm{DZ} = \frac{1}{2}h^2 \]

\[ h^2 = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) \]

Substitute \(h^2\) into MZ equation and solve for shared environment similarity (\(c^2\))

\[ r_\mathrm{MZ} = h^2 + c^2 \\ r_\mathrm{MZ} = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) + c^2 \\ r_\mathrm{MZ} - 2(r_\mathrm{MZ} - r_\mathrm{DZ}) = c^2 \\ c^2 = r_\mathrm{MZ} - 2r_\mathrm{MZ} + 2r_\mathrm{DZ} \\ c^2 = 2r_\mathrm{DZ} - r_\mathrm{MZ} \]

Therefore from MZ and DZ twin correlations we can estimate:

\[ h^2 = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) \\ c^2 = 2r_\mathrm{DZ} - r_\mathrm{MZ} \\ e^2 = 1 - h^2 - c^2 \]

Visualisation with \(r_\mathrm{MZ} = 0.75\) and \(r_\mathrm{DZ} = 0.5\).
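The three estimating equations can be sketched as a small function and applied to the twin correlations of 0.75 and 0.5 used in the visualisation. The function name is hypothetical.

```python
def twin_estimates(r_mz, r_dz):
    """Estimate h^2, c^2 and e^2 from MZ and DZ twin correlations."""
    h2 = 2 * (r_mz - r_dz)  # additive genetic
    c2 = 2 * r_dz - r_mz    # shared environment
    e2 = 1 - h2 - c2        # unique environment, equal to 1 - r_mz
    return h2, c2, e2

h2, c2, e2 = twin_estimates(r_mz=0.75, r_dz=0.5)
print(h2, c2, e2)  # 0.5 0.25 0.25
```

These estimates rest on the model above, in which MZ and DZ pairs share their environment to the same degree. With real data, sampling error can push \(c^2\) below 0 or \(h^2\) above 1.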

What do we know about psychiatric genetics from twin studies?

Meta-analysis of twin heritability